N-gram Opcode Analysis for Android Malware Detection
Authors
Abstract
Android malware has been on the rise in recent years due to the increasing popularity of Android and the proliferation of third-party application markets. Emerging Android malware families are increasingly adopting sophisticated detection-avoidance techniques, which calls for more effective approaches to Android malware detection. Hence, in this paper we present and evaluate an approach based on n-gram opcode features that uses machine learning to identify and categorize Android malware. This approach enables automated feature discovery without relying on prior expert or domain knowledge for pre-determined features. Furthermore, by using a data segmentation technique for feature selection, our analysis is able to scale up to 10-gram opcodes. Our experiments on a dataset of 2520 samples achieved an f-measure of 98% using the n-gram opcode based approach. We also provide empirical findings that illustrate factors likely to affect the overall performance trends of n-gram opcodes.
Keywords: Android Malware, Malware Detection, Malware Categorization, Dalvik Bytecode, N-gram, Opcode, Feature Selection, Machine Learning.
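To make the idea of n-gram opcode features concrete, here is a minimal sketch that slides a window of length n over an opcode sequence and counts each distinct n-gram. It is an illustration only, not the authors' pipeline: the function names `opcode_ngrams` and `ngram_counts` and the sample opcode sequence are hypothetical, and the data segmentation technique used for feature selection is not reproduced here.

```python
from collections import Counter
from typing import List, Tuple


def opcode_ngrams(opcodes: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous window of n opcodes, in order of appearance."""
    if n < 1:
        raise ValueError("n must be >= 1")
    # A sequence shorter than n yields an empty list.
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def ngram_counts(opcodes: List[str], n: int) -> Counter:
    """Count how often each opcode n-gram occurs in one sequence."""
    return Counter(opcode_ngrams(opcodes, n))


# Hypothetical opcode sequence, as might be read from one disassembled method.
sample = [
    "const/4",
    "invoke-virtual",
    "move-result",
    "invoke-virtual",
    "move-result",
    "return-void",
]

counts = ngram_counts(sample, 2)
for gram, freq in counts.most_common():
    print(" ".join(gram), freq)
```

Counts like these, collected per application, could serve as the feature vector handed to a classifier. The number of distinct n-grams tends to grow quickly with n, which is presumably why a feature selection step is needed before scaling to longer n-grams.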
Similar resources
Study of Dataset Feature Filtering of OpCode for Malware Detection Using SVM Training Phase
Malware can be defined as any type of malicious code that has the potential to harm a computer or network. To detect unknown malware families, the frequency of appearance of opcode (operation code) sequences is used through dynamic analysis. Opcode n-gram analysis is used to extract features from the inspected files. Opcode n-grams are used as features during the classification process with t...
An investigation of the classifiers to detect android malicious apps
The number of Android devices is growing exponentially, and these devices are connected through the internet, accessing billions of online websites. The popularity of these devices encourages malware developers to penetrate the market with malicious apps to annoy and disrupt the victim. Although different approaches to detecting malicious apps have been discussed, the proposed approaches do not suffice to detect the...
Malware detection: program run length against detection rate
N-gram analysis is an approach that investigates the structure of a program using bytes, characters or text strings. This research uses dynamic analysis to investigate malware detection using a classification approach based on N-gram analysis. A key issue with dynamic analysis is the length of time a program has to be run to ensure a correct classification. The motivation for this research is t...
Kullback-Leibler Divergence Based Detection of Repackaged Android Malware
Android applications are widely used by millions of users to perform many activities. Unfortunately, legitimate and popular applications are targeted by malware authors and they repackage the existing applications by injecting additional code intended to perform malicious activities without the knowledge of end users. Thus, it is important to validate applications for possible repackaging befor...
R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections
Machine Learning (ML) has proven particularly useful in malware detection. However, as malware evolves very fast, the stability of the features extracted from malware is a critical issue in malware detection. The recent success of deep learning in image recognition, natural language processing, and machine translation indicates a potential solution for stabilizing the malware detect...
Journal title:
- CoRR
Volume abs/1612.01445
Publication date 2016